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Abstract. This topical review describes the methodology of continuum variational 
and diffusion quantum Monte Carlo calculations. These stochastic methods are based 
on many-body wave functions and arc capable of achieving very high accurac;y. The 
algorithms are intrinsically parallel and well-suited to petascale computers, and the 
computational cost scales as a polynomial of the number of particles. A guide to the 
systems and topics which have been investigated using these methods is given. The 
bulk of the article is devoted to an overview of the basic quantum Monte Carlo methods, 
the forms and optimisation of wave functions, performing calculations within periodic 
boundary conditions, using pseudopotentials, excited-state calculations, sources of 
calculational inaccuracy, and calculating energy differences and forces. 
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1. Introduction 

The variational Monte Carlo (VMC) and diffusion Monte Carlo (DMC) methods are 
stochastic approaches for evaluating quantum mechanical expectation values with many- 
body Hamiltonians and wave functions pQ. VMC and DMC methods are used for both 
continuum and lattice systems, but here we describe their application only to continuum 
systems. The main attraction of these methods is that the computational cost scales 
as some reasonable power (normally from the second to fourth power) of the number of 
particles [2] . This scaling makes it possible to deal with hundreds or even thousands of 
particles, allowing applications to condensed matter. 

Continuum quantum Monte Carlo (QMC) methods, such as VMC and DMC, 
occupy a special place in the hierarchy of computational approaches for modelling 
materials. QMC computations are expensive, which limits their applicability at present, 
but they are the most accurate methods known for computing the energies of large 
assemblies of interacting quantum particles. There are many problems for which the 
high accuracy achievable with QMC is necessary to give a faithful description of the 
underlying science. Most of our work is concerned with correlated electron systems, but 
these methods can be applied to any combination of fermion and boson particles with 
any inter-particle potentials and external fields etc. Being based on many-body wave 
functions, these are zero-temperature methods, and for finite temperatures one must 
use other approaches such as those based on density matrices. 

Both the VMC and DMC methods are variational, so that the calculated energy 
is above the true ground state energy. The computational costs of VMC and DMC 
calculations scale similarly with the number of particles studied, but the prefactor is 
larger for the more accurate DMC method. QMC algorithms are intrinsically parallel 
and are ideal candidates for taking advantage of the petascale computers (10^^ flops) 
which are becoming available now and the exascale computers (10^^ flops) which will be 
available one day. 

DMC has been applied to a wide variety of continuum systems. A partial list 
of topics investigated within DMC and some references to milestone papers are given 
below. 

• Three-dimensional electron gas [21 HI El E] ■ 

• Two-dimensional electron gas [3 El El [ID] • 

• The equation of state and other properties of liquid '^He [lU [T2] . 

• Structure of nuclei [T3] . 

• Pairing in ultra-cold atomic gases [T5l [16] . 

• Reconstruction of a crystalline surface [17] and molecules on surfaces [TS] IT9]. 

• Quantum dots [20] . 

• Band structures of insulators [211 ESJ [23] . 

• Transition metal oxide chemistry [2 ^ 125 ] [25]. 
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• Optical band gaps of nanocrystals [271 EE]- 

• Defects in semiconductors [29l [301 SI] • 

• Solid state structural phase transitions 

• Equations of state of solids 

• Binding of molecules and their excitation energies [33 EEl ESI SOI 1^ - 

• Studies of exchange-correlation [121 SHI HH |15] . 

The same basic QMC algorithm can be used for each of the applications mentioned 
above with only minor modifications. The complexity and sophistication of the computer 
codes arises not from the algorithm itself, which is in fact quite simple, but from the 
diversity of the Hamiltonians and many-body wave functions which are involved. A 
number of computer codes are currently available for performing continuum QMC 
calculations of the type described here [IB]. We have developed the casino code 
[17] . which can deal with systems of different dimensionalities, various interactions 
including the Coulomb potential, external fields, mixtures of particles of different types 
and different types of many-body wave function. 

The VMC and DMC methods are described in section [2] and the types of many- 
body wave function we use are described in section [3] The optimisation of parameters 
in wave functions using stochastic methods which are both subtle and unique to the 
field is described in section |4| QMC calculations within periodic boundary conditions 
are described in section |5| the use of pseudopotentials in QMC calculations is discussed 
in section |6] and excited-state DMC calculations are briefly described in section [7j The 
scaling of the QMC methods with system size is discussed in section [8j Sources of 
errors in the DMC method and practical methods for handling errors in QMC results 
are described in section [9j In section 10 we describe how to evaluate other expectation 
values apart from the energy. Section 11 deals with the calculation of energy differences 
and energy derivatives in the VMC and DMC methods, and we make our final remarks 
in section I 



2. Quantum Monte Carlo methods 

The VMC method is conceptually very simple. The energy is calculated as the 
expectation value of the Hamiltonian with an approximate many-body trial wave 
function. In the more sophisticated DMC method the estimate of the ground state 
energy is improved by performing a process described by the evolution of the wave 
function in imaginary time. Throughout this article we will consider only systems with 
spin-independent Hamiltonians and collinear spins. We will also restrict the discussion 
to systems with time-reversal symmetry, for which the wave function may be chosen 
to be real. It is, however, straightforward to generalise the VMC algorithm to work 
with complex wave functions, and only a little more complicated to generalise the DMC 
algorithm to work with them |48j . 



Continuum variational and diffusion quantum Monte Carlo calculations 



4 



2.1. The VMC method 

The variational theorem of quantum mechanics states that, for a real, proper [IH] trial 
wave function \E't, the variational energy, 

/^T(R)^^T(R)rfR ... 

is an upper bound on the exact ground state energy i?o, ^-c, Ey > Eq. In equation ([T]), 
if is the many-body Hamiltonian and R denotes a 3A^-dimensional vector of particle 
coordinates. As discussed in section 3.1, the spin variables in equation ([T]) are implicitly 



summed over. 

To facilitate the stochastic evaluation, Ey is written as 

Ey = Jp{R)E^{R)dR, (2) 
where the probability distribution p is 

and the local energy, 

El(R) = ^T^^^T • (4) 

is straightforward to evaluate at any R. 

In VMC the Metropolis algorithm [50] is used to sample the probability distribution 
p(R). Let the electron configuration at a particular step be R'. A new configuration 
R is drawn from the probability density T(R ^ R'), and the move is accepted with 
probability 

^ ' \ ' T(R ^ R')^^(R') J ^ ^ 

It can easily be verified that this algorithm satisfies the detailed balance condition 

*^(R)T(R' ^ R)A(R' ^ R) = ^2 (R')2-(R_ ^ R')v4(R ^ R'). (6) 

Hence p(R) is the equilibrium configuration distribution of this Markov process and, 
so long as the transition probability is ergodic (i.e., it is possible to reach any point 
in configuration space in a finite number of moves), it can be shown that the process 
will converge to this equilibrium distribution. Once equilibrium has been reached, the 
configurations are distributed as p(R), but successive configurations along the random 
walk are in general correlated. 

The variational energy is estimated as 
1 M 

^v^ttE^l(R.), (7) 

i=l 

where M configurations Rj have been generated after equilibration. The serial 
correlation of the configurations and therefore local energies Ei^(R,i) complicates the 
calculation of the statistical error on the energy estimate: see section 9.2[ Other 
expectation values may be evaluated in a similar manner to the energy. 
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Equation Q is an importance sampling transformation of equation ([T]). Equation 
(|2| exhibits the zero variance property: as the trial wave function approaches an exact 
eigenfunction (\E't — the local energy approaches the corresponding eigenenergy, 
Ei, everywhere in configuration space. As \I/t is improved, Ei^ becomes a smoother 
function of R and the number of sampling points, M, required to achieve an accurate 
estimate of E^ is reduced. 

VMC is a simple and elegant method. There are no restrictions on the form of trial 
wave function which can be used and it does not suffer from a fermion sign problem. 
However, even if the underlying physics is well understood it is often difficult to prepare 
trial wave functions of equivalent accuracy for two different systems, and therefore the 
VMC estimate of the energy difference between them will be biased. We use the VMC 
method mostly to optimise parameters in trial wave functions (see section |4]) and our 
main calculations are performed with the more sophisticated DMC method, which is 
described in the next section. 



2.2. The DMC method 

In DMC the operator exp(— is used to project out the ground state from the initial 
state. This can be viewed as solving the imaginary-time Schrodinger equation, which 
for electrons is 

- -<I>(R, t) = (H- Et) $(R, t) = + V{R) - Erj $(R, t) , (8) 

where t is a real variable measuring the progress in imaginary time, V is the potential 
energy (assumed to be local for the time being), and E^ is an arbitrary energy offset 
known as the reference energy. Throughout this article we use Hartree atomic units 
where nie = h = \e\ = Aneo = 1, where rrie is the mass of the electron and e is its charge. 
Equation ([s]) can be solved formally by expanding $(R, t) in the eigenstates 0i of the 
Hamiltonian, 

$(R,t)=5:Q(t)0,(R), (9) 

i 

which leads to 

$(R,t) =^exp[-(E,-ET)t]Q(O)0,(R) . (10) 

i 

For long times one finds 

^(R, t ^ oo) ~ exp[-(Eo - ^t)^] Co(0)(/)o(R) , (11) 

which is proportional to the ground state wave function, 0o- 

The Hamiltonian is the sum of kinetic and potential terms: H = —{1/2)'V'^ + V(R). 
Suppose for a moment that we can interpret the initial state, J2i Cj(O)0i, as a probability 
distribution. If we neglect the potential term then the imaginary-time Schrodinger 
equation ^ reduces to a diffusion equation in the configuration space. If, on the other 
hand, we neglect the kinetic term, ([s]) reduces to a rate equation. It should not be 
surprising that a short time slice of the imaginary-time evolution can be simulated by 



Continuum variational and diffusion quantum Monte Carlo calculations 



6 



taking a population of configurations {Rj} and subjecting them to random hops to 
simulate the diffusion process, and "birth" and "death" of configurations to simulate 
the rate process. By "birth" and "death" we mean replicating some configurations 
and deleting others at the appropriate rates, a process which is often referred to as 
"branching" . 

Unfortunately the wave function cannot in general be interpreted as a probability 
distribution. A wave function for two or more identical fermions must have positive and 
negative regions, as should an excited state of any system. One can construct algorithms 
which are formally exact using two distributions of configurations with positive and 
negative weights |5T], but they are inefficient and the scaling of the computational cost 
with system size is unclear. 

The fixed-node approximation E3] provides a way to evade the sign problem. 
(In a 3D system, the nodal surface is the (3iV — l)-dimensional surface on which the 
wave function is zero and across which it changes sign.) The fixed-node approximation 
is equivalent to placing an infinite repulsive potential barrier on the nodal surface of the 
trial wave function which is sufficiently strong to force the wave function to be zero on 
the nodal surface. In effect we solve the Schrodinger equation exactly within each pocket 
enclosed by the nodal surface, subject to the boundary condition that the wave function 
is zero on the nodal surface. The infinite repulsive potential barrier has no effect if the 
trial nodal surface is placed correctly but, if it is not, the energy is always raised. It 
follows that the DMC energy is always less than or equal to the VMC energy with the 
same trial wave function, and always greater than or equal to the exact ground-state 
energy. 

The fixed-node DMC algorithm described above is extremely inefficient and a 
vastly superior algorithm can be obtained by introducing an importance sampling 
transformation EH ESI- Consider the mixed distribution, 



/(R,t) = vI/T(R)$(R,t), (12) 

which has the same sign everywhere if and only if the nodal surface of $(R, t) equals 
that of \I/t(R). Substituting in equation ([s]) for $ we obtain 
df 1 

where the 3A^-dimensional drift velocity is defined as 

v(R) = ^t'(R)Vr^t(R) . (14) 



The three terms on the right-hand side of equation (13) correspond to diffusion, drift 
and branching processes, respectively. The importance sampling transformation has 
several consequences. First, the density of configurations is increased where |\E't| is 
large, so that the more important parts of the wave function are sampled more often. 
Second, the rate of branching is now controlled by the local energy which is normally 
a much smoother function than the potential energy. This is particularly important for 
the Coulomb interaction, which diverges when particles are coincident. The importance 
sampling transformation, together with an algorithm that imposes /(R, t) > 0, ensures 
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that \1/t and $(R, t) have the same nodal surfaces, as can be seen in equation (12). 
The importance samphng transformation also reduces the statistical error bar on the 
estimate of the energy and leads to a zero variance property analogous to that in VMC 
The importance-sampled imaginary-time Schrodinger equation may be written in 
integral form: 

/(R, t) = j G(R ^ R', t - t') d'^ , (15) 

where the Green's function G(R R',t — t') is a solution of equation (13) satisfying 
the initial condition G(R ^ R',0) = 5(R — R'). The exact Green's function can be 
sampled using the Green's function Monte Carlo (GFMC) algorithm developed by Kalos 
and coworkers [56l EZl ESI EH |59] . 

Let us interpret /(R, t) as the probability distribution of a discrete population of 
P configurations with positive weights: 

/(R, t) = (j2 wp{t) S[R - R,{t)]j , (16) 

where the pth configuration at time t has position Rp(t) in configuration space and 
weight Wp(t), and the angled brackets denote an ensemble average. Using equation (15), 
the evolution of /(R, t) to time t + r yields 

/(R,t + r)= (j2Mt)G[R^Rp{t),T]^ 

= (T.wp{t + T)6[R-R,{t + T)]y (17) 

The dynamics of the configurations and their weights is governed by the Green's 
function. 

The GFMC algorithm is computationally expensive, but considerably faster 
calculations can be made using an approximate Green's functions which becomes exact 
in the limit of infinitely small time steps. Within the short-time approximation 

G{R ^ R', r) ~ G'st(R ^ R', r) = Gd(R ^ R', t)Gb{R ^ R', r) , (18) 

where 

Go(R^R'.r)=p^exp(-t^^^:^^I^) (19) 

is the drift-diffusion Green's function and 

Gb{R ^ R',r) = exp (^-^ [^l(R) + ^l(R') - 2Et]) (20) 

is the branching factor. 

The process described by G'd(R ^ R', t) is simulated by making each configuration 
R' in the population drift through a distance rv(R'), then diffuse by a random distance 
drawn from a Gaussian distribution of variance r. Each configuration is then copied 
or deleted in such a fashion that, on average, Gb(R configurations continue 

from the new position R. When using the short time approximation, configurations 
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occasionally attempt to cross the nodal surface but such moves may simply be rejected. 
The short time approximation leads to a dependence of DMC results on the time step. 
It is important to investigate the size of the time step dependence, and it is common 
practice to extrapolate the energy to zero time step: see figure [5j It turns out that 
Gst does not precisely satisfy the detailed-balance condition, but it is standard practice 
to reinstate detailed balance by incorporating an accept-reject step. The importance- 
sampled fixed-node fermion DMC algorithm was first used by Ceperley and Alder in 
their ground-breaking study of the homogeneous electron gas (HEG) [3j . 

It can be seen that the reference energy E-^ appears in the branching factor of 
equation (20). By adjusting the reference energy during the simulation we may keep the 
total population close to a target value, preventing the population from either increasing 
exponentially or dying out. An example of the behaviour of the total population and 
the reference energy can be seen in figure [l] pj. 

Another important aspect of practical implementations is that the particles are 
normally moved one at a time in both VMC and DMC algorithms. The trial wave 
function can usually be evaluated more rapidly when a single particle has been moved 
than if all particles have been moved, and a longer time step can be employed for an 
equivalent time-step error. The correlation length of the local energy is shorter for 
single-particle moves and overall the efficiency is considerably increased |UU] . 

The initial configurations are normally taken from a VMC calculation and 
equilibrated within DMC for a period of imaginary time. The importance-sampled DMC 
algorithm generates configurations asymptotically distributed according to /(R) = 
\EfT (1^)00(1^), where 0o is the ground state of the Schrodinger equation subject to the 
fixed-node boundary condition. Noting that Hcj)^ = -Eq^o everywhere (except on the 
nodal surface where 0o = 0) the fixed-node DMC energy can be evaluated using the 
formula 

(MmT) _ IfiR)EUR)dR 



M 



M 



E^l(R.). (22) 



=1 



Some example DMC data are shown in figure [Tj 



3. Trial wave functions 



Trial wave functions are of central importance in VMC and DMC calculations because 
they introduce importance sampling and control both the statistical efficiency and 
accuracy obtained. The accuracy of a DMC calculation depends on the nodal surface of 
the trial wave function via the fixed-node approximation, while in VMC the accuracy 
depends on the entire trial wave function. VMC energies are therefore more sensitive 
to the quality of the trial wave function than DMC energies. 
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Figure 1. DMC data for a silane (SiH4) molecule, with the ions represented by 
pseudopotentials. The upper panel shows the fluctuations in the population of 



configurations arising from the branching process used to simulate equation (20). The 



reference energy, Et, is altered during the run to control the population. Specifically, 
the reference energy is set to return the population to the target population (128,000 
configurations) on the same time-scale as the autocorrelation period of the energy data 
[T]. The total energy is shown in the lower panel as a function of the move number. 
The black line shows the instantaneous value of the local energy averaged over the 
current population of configurations, the red line is the reference energy and the 
green line is the best estimate of the DMC energy as the simulation progresses. The 
configurations at move number zero are from the output of a VMC simulation, and 
the energy decays rapidly from its initial VMC value of about -6.250 a.u. and reaches 
a plateau with a DMC energy of about -6.305 a.u. The data up to move 1000 are 
deemed to form the equilibration phase, and are discarded. 



3.1. Slater-Jastrow wave functions 

QMC calculations require a compact trial wave function which can be evaluated rapidly. 
Most studies of electronic systems have used the Slater-Jastrow form, in which a pair 
of up- and down-spin determinants is multiplied by a Jastrow correlation factor, 



^sj(R)= e'^^^ det k(rl 



det 



(23) 
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where e*^ is the Jastrow factor and det V'n(i'I) is a determinant of single-particle orbitals 
for the up-spin electrons. The quality of the single-particle orbitals is very important, 
and they are often obtained from density functional theory (DFT) or Hartree-Fock (HP) 
calculations. Note that the spin variables themselves do not appear in equation (23). 
Formally the sum over spin variables in the expectation values in equations ([T]) and (21 ) 
has already been performed and the single determinant with spin variables is replaced 
by two determinants of up- and down-spin orbitals whose arguments are the up- and 
down-spin electron coordinates Rf and R^, respectively. This is explained in more detail 
in reference [1]. 

The Jastrow factor is taken to be symmetric under the interchange of identical 
particles and its positivity means that it does not alter the nodal surface of the trial 
wave function. The Jastrow factor introduces correlation by making the wave function 
depend explicitly on the particle separations. The optimal Jastrow factor is normally 
small when particles with repulsive interactions (for example, two electrons) are close 
to one another and large when particles with attractive interactions (for example, an 
electron and a positron) are close to one another. 

The Jastrow factor can also be used to ensure that the trial wave function obeys the 
Kato cusp conditions [6Tj, which leads to smoother behaviour in the local energy £'l(R)- 
When two particles interacting via the Coulomb potential approach one another, the 
potential energy diverges, and therefore the exact wave function ^! must have a cusp so 
that the local kinetic energy — (1/2)\E'~^V^^ supplies an equal and opposite divergence. 
It seems very reasonable to enforce the cusp conditions on trial wave functions because 
they are obeyed by the exact wave function. Imposition of the cusp conditions is in fact 
very important in both VMC and DMC calculations because divergences in the local 
energy lead to poor statistical behaviour and even instabilities in DMC calculations due 
to divergences in the branching factor. 

Figure [2] shows the local energies generated during two VMC runs for a silane 
molecule in which the Si"^"*" and H"*" ions are described by smooth pseudopotentials. In 
figure |2]^a) the trial wave function consists of a product of up- and down-spin Slater 
determinants of molecular orbitals. The Kato cusp conditions for electron-electron 
coalescences are therefore not satisfied and the local energy shows very large positive 
spikes when two electrons are close together. Figure |2]^b) shows the effect of adding a 
Jastrow factor which satisfies the electron-electron cusp conditions. The large positive 
spikes in the local energy are removed and the mean energy is lowered. Some small spikes 
remain, and the frequency and size of the positive and negative spikes are roughly equal. 
These spikes arise from electrons approaching the nodes of the trial wave function, where 
the local kinetic energy diverges positively on one side of the node and negatively on 
the other side. 

The basic Jastrow factor that we use for systems of electrons and ions contains 
the sum of homogeneous, isotropic electron-electron terms m, isotropic electron-nucleus 
terms x centred on the nuclei and isotropic electron-electron- nucleus terms /, also 
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(a) 



(b) 
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Figure 2. Local energy of a silane (SiH4) molecule from a VMC calculation (a) using 
a Slater-determinant trial wave function and (b) including a Jastrow factor. 



centred on the nuclei [62]. We use a Jastrow factor of the form exp[J(R)], where 

N Anions N Nions N 

J({r,},{r,}) = + E T.Xi{ni) + E E/K^i/,^.7,^i.) , (24) 

i>j 1=1 i=l 1=1 i>j 

N is the number of electrons, Anions is the number of ions, Vij = — r^, rn = rj — r/, rj 
is the position of electron i and rj is the position of nucleus /. The functions u, x and / 
are represented by power expansions with optimisable coefficients. Different coefficients 
are used for terms involving different spins. Note that, even if the determinant part of 
the Slater- Jastrow wave function is an eigenfunction of the spin operator S"^, the use 
of different coefficients for parallel-spin and antiparallel-spin pairs of electrons generally 
leads to a trial wave function that is not an eigenfunction of 5*^. 

When using periodic boundary conditions, we often add a plane-wave term in the 
electron-electron separations, p{rij), which describes similar sorts of correlation to the 
u term. The u{rij) term, however, is cut off at a distance less than or equal to the 
Wigner-Seitz radius of the simulation cell, and the p term adds variational freedom 
in the corners of the simulation cell. Occasionally we add a plane-wave expansion in 
electron position, g(rj), and also occasionally add three-body electron-electron-electron 
terms. 

We have recently developed a more general form of Jastrow factor [HS] which allows 
the inclusion of higher order terms than those of equation (24), such as terms involving 
the distances between four or more particles. An example of the application of such 
a Jastrow factor to the H2 molecule is shown in figure |3j The molecular orbital was 
calculated within Hartree-Fock theory and VMC calculations were performed including 
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Figure 3. The difference between the VMC energy and the exact ground state energy 
against the variance of the VMC local energies on logarithmic scales for H2 at a bond 
length of 1.397453 a.u. obtained using Jastrow factors of increasing complexity. "HF" 
indicates a wave function consisting of a molecular orbital obtained from a Hartree- 
Fock calculation and "e-e-N" denotes a term in the Jastrow factor involving the three 
distances between two electrons and one proton, etc. 



Jastrow factors of increasing complexity. The Jastrow factor of equation (24) includes 



electron-nucleus (e-N etc.), e-e and e-e-N terms, but the additional reductions in energy 
from including the e-N-N and e-e-N-N terms are clearly visible in figure [3] 



3.2. Pairing wave functions 

Slater- Jastrow wave functions are not appropriate for all systems. For example, the 
strongly attractive interaction between electrons and holes within an effective-mass 
theory leads to the formation of excitons, which are not well described by a Slater- 
Jastrow wave function. A more appropriate wave function [M] is formed from the 
antisymmetrised product of identical electron-hole pairing functions ip, multiplied by a 
Jastrow factor, 

^sp(R.) = e'^(^) det [V'(rl, rj)] . (25) 

It is also possible to include additional orbitals for unpaired particles within this wave 
function. 
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3. 3. Multi- determinant wave functions 

Multi-determinant expansions have been used with considerable success over many 
decades within the quantum chemistry community. The trial wave function can be 
written as 

^md(R)= e-'(^) det [M^])] det [M^])] , (26) 

n 

where the Cn are coefficients. This method provides a systematic approach to improving 
the trial wave function, and there have been numerous applications of multi-determinant 
trial wave functions in QMC calculations for small molecules [551 [Ml EZ] ■ Such trial wave 
functions can capture near-degeneracy effects (also known as static correlation) . Multi- 
determinant wave functions are not in general suitable for large systems because the 
number of determinants required to retrieve a given fraction of the correlation energy 
increases exponentially with system size. An exception to this occurs if only a small 
region of the system requires a multi-determinant description. An example of a DMC 
calculation of this type is the study of the electronic states formed by the strongly 
interacting dangling bonds at a neutral vacancy in diamond by Hood et al. [30j. 

3.4- Backflow wave functions 

Additional correlation effects can be incorporated in the trial wave function using 
backflow transformations [681 169]. Consider a solid ball falling through a classical liquid. 
The incompressible liquid is pushed out of the way and it ffils in behind the ball to form 
a characteristic flow pattern. One can imagine that similar correlations occur as a 
quantum particle moves through a quantum fluid, as shown in figure |4j Much of this 
correlation can be captured in a Jastrow factor which, however, preserves the nodal 
surface of the wave function. The backflow motion gives an additional contribution 
which leaves its imprint on the nodes. Quantum backflow was discussed by Feynman 
and coworkers [Ml EH] for excitations in ^He and the effective mass of a ^He impurity 
in liquid "^He. Backflow wave functions have been used successfully in QMC studies 
of liquid He [TOl [12], the electron gas [HI [721 [H], hydrogen systems [3l], and various 
inhomogeneous systems [601 El] ■ 

The backflow wave functions we use (HOI can be written as 



^bf(R) = e-'(^) det [^,(rl + |,(R))] det [^'.(rj + ^,(R))] . (27) 

For a system of N electrons and A^ion classical ions we write the backflow displacement 
for electron i in the form 

N Afion N Mon 

= E vv^ij + E f^ii^ii + E E i^i'^^j + 0''ra) . (28) 

In this expression rjij = rjijij) is a function of electron-electron separation, /ij/ = fi{rij) is 
a function of electron-ion separation, and = ^{rij,rjj,rij) and = Q{rii,rjj,rij). 
We parameterise the functions t], /i, $ and using power expansions with optimisable 
coefficients [60] . 
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Figure 4. Effect of tlie motion of an electron (black, with the arrow showing the 
direction of motion) on the backflow-transformed coordinates of three opposite-spin 
electrons (red, green and blue). Circles with the same colour intensity correspond to 
the same instant in the motion. 



3. 5. Other wave functions 



The wave function types of equations (23), (25), (26), and (27) can be combined in 



various ways within the casino code |17] so that, for example, it is possible to use 
Slater- Jastrow-pairing-backflow wave functions, etc. Of course the range of possible 
wave functions could be extended by, for example, including Pfaffian wave functions 
[751 [76], etc. 



4. Optimisation of trial wave functions 

Optimising trial wave functions is a very important part of QMC calculations which can 
consume large amounts of human and computing resources. With modern stochastic 
methods it is possible to optimise hundreds or even thousands of parameters in the wave 
function. The parameters which can be optimised include those in the Jastrow factor, 
the coefficients of determinants in a multi-determinant wave function, the parameters 
in the backflow functions and the parameters in single-particle and pairing orbitals. 

The trial wave function used in a DMC calculation should ideally be optimised 
within DMC, but reliable and efficient methods to achieve this are still under 
development [771 178] . Minimisation of the DMC energy has been performed "by hand" 
for small numbers of parameters [6], [10]. Wave function optimisation within casino is 
performed by minimising the VMC energy or its variance. 

Optimising wave functions by minimising the variance of the energy is an old idea 
dating back to the 1930s. The first application within Monte Carlo methods may have 
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been by Conroy [79], but the method was popularised within QMC by the work of 
Umrigar and coworkers |80]. It is now generally believed that it is better to minimise 
the VMC energy than its variance, but it has proved more difficult to develop robust and 
efficient algorithms for this purpose. Since the trial wave function forms used cannot 
generally represent energy eigenstates exactly, except in trivial cases, the minima in the 
energy and variance do not coincide. Energy minimisation should therefore produce 
lower VMC energies, and although it does not necessarily follow that it produces lower 
DMC energies, experience indicates that, more often than not, it does. 



4-1. Variance minimisation 

The variance of the VMC energy is 



J, , /|*¥(R)]2|Bf(R)-B9]2<iR 



where a denotes the set of variable parameters. The minimum possible value of o"^(q;) 
is zero, which is obtained if and only if \E'° is an exact eigenstate of H. In practice 
the trial wave function forms used are incapable of representing the exact eigenstates. 
Nevertheless, the minimum value of cr^ia) is still expected to correspond to a reasonable 
set of wave function parameters. 

Minimisation of o"^(ck) is carried out via a correlated sampling approach in which a 
set of configurations distributed according to [\E'"'']^ is generated, where cxq is an initial 
set of parameter values |8j. o"^(q;) is then evaluated as 

where the integrals contain weights, w^^, given by 



al2 

T 



<o(R) = |^, (31) 
and Ey is evaluated using 

After generating the initial set of configurations, the optimisation proceeds using 
standard techniques to locate the new parameter values which minimise o"^(q;). With 
perfect sampling (j'^{ol) is independent of the initial parameter values ckq. For real 
(finite) sampling, however, one runs into problems because the values of w^^^ for different 
configurations can vary by many orders of magnitude if cx and ckq differ substantially. 
During the minimisation procedure a few configurations (often only one) acquire very 
large weights and the estimate of the variance is reduced almost to zero by a poor set of 
parameter values. This optimisation scheme is therefore often unstable, and in practice 
modified versions of it are used. 

The above scheme can be made much more stable by altering the weights w^o- 



A robust procedure is to set all the weights w° in equation (30) to unity, which is 
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reasonable because the minimum value of <y'^{cx) = is still obtained only if Ei^{R) is 
a constant independent of R, which holds only for eigenstates of the Hamiltonian. We 
call this the "unreweighted variance" minimisation method. The procedure is cycled 
until the parameters converge to their optimal values (within the statistical noise). For 
a number of model systems it was found that the trial wave functions generated by 
unreweighted variance minimisation iterated to self-consistency have a lower variational 
energy than wave functions optimised by reweighted variance minimisation 



If the Jastrow factor of equation (24) can be written in the form 



J{R)=Y.anfn{R) , (33) 



then it is possible to simplify the calculation of the variance of the VMC energy 
It can be shown that the unreweighted variance is a quartic function of the linear 
parameters a„ [82]. This has two advantages: (i) the unreweighted variance can be 
evaluated extremely rapidly at a cost which depends only on the number of parameters 
and is independent of the number of particles; and {ii) the unreweighted variance along 
a line in parameter space is a quartic polynomial. This is useful because it allows 
the exact global minimum of the unreweighted variance along the line to be computed 
analytically by solving the cubic equation obtained by setting the derivative equal to 
zero. 

The unreweighted variance minimisation method works well for optimising Jastrow 
factors, but it often performs poorly when parameters which alter the nodal surface 
of \E't are optimised. The problem is that the local energy Ei^ generally diverges for 
a configuration on the nodal surface. As the parameter values are changed during a 
minimisation cycle the nodal surface can move through a configuration, resulting in a 
very large (positive or negative) value of i?L, which adversely affects the optimisation. 
Such an effect would not occur when using the weights w'^^ because they go to zero on 
the nodal surface. We have developed two schemes which solve this problem. In the first 
scheme we limit the weights by replacing them with min(w°^,iy), so that the weight 
goes to zero on the nodal surface but can never become larger than a chosen value W . 
In the second scheme we use a weight which goes smoothly to zero as Ei^ deviates from 
an estimate of the energy. 

Unreweighted variance minimisation belongs to a wider class of wave-function 
optimisation methods which are based on minimising a measure of the spread of the set 
of local energies. Another measure of spread that we have used with considerable success 
for wave-function optimisation is the mean absolute deviation of the local energies of a 
set of configurations from the median energy, 

^ /[vi/^°(R)]^l^g(R)-^gMR 

/[^?°(R)]2dR ■ ^ ' 

In this expression, is the median value of the local energies evaluated with the 
parameter values ex. This is useful for optimising parameters that affect the nodal 
surface, because outlying local energies are less significant. 
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Jf..2. Energy minimisation 

A well-known method for finding approximations to the eigenstates of a Hamiltonian is 
to express the wave function as a linear combination of basis states gi, 

VI/T(R) = EA^^(R), (35) 

calculate the matrix elements Hij = {gi\H\gj) and Sij = {gi\gj), and solve the two-sided 
eigenproblem J2jHijf3j = EJ2jSij(3j by standard diagonalisation techniques. One can 
also do this in QMC [H], although the statistical noise in the matrix elements leads 
to slow convergence with respect to the number of configurations used to evaluate the 
integrals. 

Nightingale and Melik-Alaverdian reformulated the diagonalisation procedure 
as a least-squares fit rather than integral evaluation, which leads to much faster 
convergence with the number of configurations. Let us assume that the set {gi} spans 
an invariant subspace of H, which means that the result of acting H on any member of 
the set {gi} can be expressed as a linear combination of the {gi}, i.e., 

Hgi{R)=j2^ij9j{^) Vz. (36) 

i=l 

The eigenstates and associated eigenvalues of H can then be obtained by diagonalising 
the matrix Sij. Within a Monte Carlo approach we could evaluate the gi(R) and Hgi(R) 
for p uncorrelated configurations generated by a VMC calculation and solve the resulting 
set of linear equations for the Sij. For problems of interest, however, the assumption 
that the set {gi} span an invariant subspace of H does not hold and there exists no set of 



Sij which solves equation ( 36 ) . If we took p configurations and solved the set of p linear 
equations, the values of Sij would depend on which configurations had been chosen. 
To overcome this problem, a number of configurations M ^ p is sampled to obtain 
an overdetermined set of equations which can be solved in a least-squares sense using 
singular value decomposition. In fact Nightingale and Melik-Alaverdian recommended 



that equation (36) be divided by \E't(R) so that in the limit of perfect sampling the 
scheme corresponds precisely to standard diagonalisation. 

The method of Nightingale and Melik-Alaverdian works very well for linear 



variational parameters as in equation (35). The natural generalisation to parameters 
which appear non-linearly in \E't is to consider the basis of the initial trial wave function 
{go = \E't) and its derivatives with respect to the variable parameters. 

In its simplest form this algorithm turns out to be highly unstable because the first-order 



approximation in equation (37) is often inadequate. Umrigar and coworkers 
showed how this method can be stabilised. The details of the stabilisation procedures 
are quite involved and we refer the reader to the original papers [86l [87] for the details. 
The stabilised algorithm works well and is quite robust. The VMC energies given by this 
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method are usually lower than those obtained from any of the variance-based algorithms 
described in section |4.1[ although the difference is often small. 



5. QMC calculations within periodic boundary conditions 

QMC calculations for extended systems may be performed using cluster models or 
periodic boundary conditions, just as in other techniques. Periodic boundary conditions 
are preferred because they give smaller finite size effects. One can also use the standard 
supercell approach for systems which lack three-dimensional periodicity where a cell 
containing, for example, a point defect and a small part of the host crystal, are repeated 
periodically throughout space. Just as in other electronic structure methods, one must 
ensure that the supercell is large enough for the interactions between defects in different 
supercells to be small. 

When using standard single-particle-like theories within periodic boundary 
conditions such as density functional theory, the charge density and potentials are 
taken to have the periodicity of a chosen unit cell or supercell. The single particle 
orbitals can then be chosen to obey Bloch's theorem and the results for the infinite 
system are obtained by summing quantities obtained from the different Bloch wave 
vectors within the first Brillouin zone. This procedure can also be applied within HF 
calculations, although the Coulomb interaction couples the Bloch wave vectors in pairs. 
The situation with the many-particle wave functions described in section [3] is somewhat 
different. Although the many-particle wave function satisfies Bloch theorems [88| 189], it 
is not possible to perform a many-particle calculation using a set of k-points; one has to 
perform it at a single k-point. A single k-point normally gives a poor representation of 
the infinite-system result, so that one either chooses a larger non-primitive simulation 
cell, or averages over the results of QMC calculations at a set of different k-points [90] , 
or both. 

Many-body techniques such as QMC also suffer from finite size errors arising from 
long-ranged interactions, most notably the Coulomb interaction. Coulomb interactions 
are normally included within periodic boundary conditions calculations using the 
Ewald interaction. Long-ranged interactions induce long-ranged exchange-correlation 
interactions, and if the simulation cell is not large enough these effects are described 
incorrectly. Such effects are absent in local DFT calculations because the interaction 
energy is written in terms of the electronic charge density, but HF calculations show 
very strong effects of this kind and various ways to accelerate the convergence have 
been developed. The finite size effects arising from the long-ranged interaction can be 
divided into potential and kinetic energy contributions [HIl 02] • The potential energy 
component can be removed from the calculations by replacing the Ewald interaction by 
the so-called model periodic Coulomb (MFC) interaction [931 [9ll [95] . Recent work has 
added substantially to our understanding of finite size effects, and theoretical expressions 
have been derived for them [HH [92] , but at the moment it seems that they cannot entirely 
replace extrapolation procedures. 



Continuum variational and diffusion quantum Monte Carlo calculations 



19 



Kwee et al. [96] have developed an alternative approach for estimating finite size 
errors in QMC calculations. DMC results for the three-dimensional HEG are used to 
obtain a system-size-dependent local density approximation (LDA) functional. The 
correction to the total energy is given by the difference between the DFT energies for 
the finite-sized and infinite systems. This approach appears promising, although it does 
rely on the LDA giving a reasonable description of the system. 

6. Pseudopotentials in QMC calculations 

The computational cost of a DMC calculation increases with the atomic number Z of 
the atoms as roughly Z^'^ ^71 EH] which makes calculations with Z > 10 extremely 
expensive. This problem can be solved by using pseudopotentials to represent the effect 
of the atomic core on the valence electrons. The use of non-local pseudopotentials within 
VMC is quite straightforward [99l llOOj . but DMC poses an additional problem because 
the use of a non-local potential is incompatible with the fixed-node boundary condition. 
To circumvent this difficulty an additional approximation is made. In the "locality 
approximation" |101] the non-local part of the pseudopotential Va\ is taken to act on the 
trial wave function rather than the DMC wave function, i. e., V^i is replaced by Vni\E'T- 
The leading-order error term in the locality approximation is proportional to (\EfT — 0o)^ 
|1U1] . where 0o is the exact fixed-node ground state wave function, although it can be 
of either sign, so that the variational property of the algorithm is lost. Casula et al. 
\102\ I103j have introduced a fully variational "semi-localisation" scheme for dealing with 
non-local pseudopotentials within DMC, which also shows superior numerical stability 
to the locality approximation. 

Currently it is not possible to generate pseudopotentials entirely within a QMC 
framework, and therefore they are obtained from other sources. There is evidence that 
HF theory provides better pseudopotentials than DFT for use within QMC calculations 
|104] . and we have developed smooth relativistic HF pseudopotentials for H to Ba and 
Lu to Hg, which are suitable for use in QMC calculations |105[ 11061 I107j . Another 
set of pseudopotentials for use in QMC calculations has been developed by Burkatzki 
et al. |108] . In the few cases where reliable tests have been performed |109t IllOj . the 
pseudopotentials of references I1U61 1107] and those of |108j have produced almost 
identical results, although those of references |105[ I106[ llOTj are a little more efficient 
as they have smaller core radii. 

7. DMC calculations for excited states 

The fixed-node DMC algorithm is useful for studying excited states because it gives the 
exact excited-state energy if the nodal surface of the trial wave function matches that 
of the exact excited state and it gives an approximation to the excited-state energy if a 
trial wave function with an approximate nodal surface is used. 

This can be proved as follows. The local energy calculated with the exact excited- 
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state wave function is equal to the exact excited-state energy throughout configuration 
space, and, by definition, the wave function is zero at the nodal surface and nowhere else. 
Hence within each nodal pocket the exact excited-state wave function is the ground-state 
solution of the Schrodinger equation subject to the boundary condition of being zero on 
the pocket boundary. Therefore the ground-state pocket eigenvalues are all equal to the 
exact excited-state energy, and the fixed-node DMC algorithm indeed gives the exact 
excited-state energy. 

An important difference from the ground state case is that the existence of a 
variational principle for excited state energies cannot in general be guaranteed, and 
it depends on the symmetry of the trial wave function In practice DMC works 

quite well for excited states [221 M, [lEl [ml EZl EH HH]. Ceperley and Bernu |115] 
have devised a method which combines DMC and the variational principle to calculate 
the eigenvalues of several different excited states simultaneously. However, this method 
suffers from stability problems in large systems. 

8. Scaling of computational effort with system size 

Over the accessible range of system sizes, the computational cost of a single configuration 
move in a VMC or DMC calculation is usually determined by the time taken to evaluate 
each of the 0{N) orbitals in the Slater part of the wave function at each of the electron 
positions p. If the delocalised orbitals are expanded in localised basis functions then 
the time taken to move a configuration scales as 0{N'^). However, the number of 
configuration moves required to achieve a given error bar on the total energy grows as 
0{N), because the variance of the energy is proportional to the system size. Hence the 
time taken to evaluate the total energy to within a given statistical error bar scales as 
0{N^). (Note that the time taken to evaluate the Slater determinants during the run 
scales as (9(A^^), but with a small prefactor. In fact, for the DMC method the scaling 
with system size is ultimately exponential due to correlations within the configuration 
population [2].) 

The scaling of the QMC methods can be improved by using localised orbitals, 
so that the number of nonzero orbitals to be evaluated at each electron position is 
independent of the system size |116t 1117] . In this case the CPU time required to achieve 
a given error bar on the total energy scales as 0{N'^) over the relevant range of system 
sizes. To maximise the localisation of the orbitals, the orthogonality constraint can be 
dropped, for it is irrelevant in QMC. However, it is not possible to "cheat" on the size 
of the orbital localisation regions in QMC, because this would compromise the high 
accuracy of the method. (The use of localised orbitals enables the use of sparse linear 
algebra to compute the Slater determinants, improving the scaling of this part of the 
algorithm by a factor of N as well.) 

In calculations of the energy per particle of a periodic crystal the number of moves 
required to achieve a given error per particle falls off as 0{N~^). Hence the CPU time 
required to achieve a given error bar on the energy per particle increases as 0{N) in 



Continuum variational and diffusion quantum Monte Carlo calculations 



21 



the standard algorithm and is roughly independent of the system size when localised 
orbitals are used. 

9. Sources of error and statistical analysis 

9.1. Sources of error in DMC calculations 

The potential sources of errors in DMC calculations may be summarised as follows. 

• Statistical errors. The standard error in the mean is proportional to 1/a/M, where 
M is the number of particles moves. It therefore costs a factor of 100 in computer 
time to reduce the statistical error bars by a factor of 10. On the other hand, 
a random error is much better than a systematic one as its size can normally be 
reliably estimated. 

• Fixed-node error. This is the central approximation of the DMC technique, and is 
normally the limiting factor in the accuracy of the results. 

• Time-step bias. The short time approximation leads to a bias in the / distribution 
and hence in expectation values. This bias is often significant and can be of either 
sign, but it can be largely removed by performing calculations for different time 
steps and extrapolating to zero time step or by simply choosing a small enough 
time step. An example of time-step extrapolation is shown in figure |5| 

• Population control bias. The / distribution is represented by a finite population of 
configurations which fluctuates due to branching. The population may be controlled 
in various ways, but this introduces a population control bias which is positive and 
falls off as the reciprocal of the population. In practice the population control bias 
is normally so small that it is difficult to detect |118[ E] . 

• Finite size errors within periodic boundary conditions calculations. It is important 
to correct for finite size effects carefully, as mentioned in section [5j 

• The pseudopotential approximation inevitably introduces errors. In DMC there is 
an additional error arising from the localisation |101] or semi-localisation |103] of 
the non-local pseudopotential operator. The localisation error appears to be quite 
small in the cases for which it has been tested [73] . 

9.2. Practical methods for handling statistical errors in QMC results 

Two main practical problems are encountered when dealing with errors in the QMC 
data: the data are serially correlated and the underlying probabihty distributions are 
non-Gaussian. The probability distribution of the local energies has \E — Eq\~'^ tails, 
where Eo is a constant. These tails arise from singularities in the local energy such as 
the divergence at the nodal surface \105\ 1106] , as shown in figure |6j In consequence, 
although the mean energy and its variance are well defined, the variance of the variance 
is infinity. For other quantities the problem may be even more severe; for example. 
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Figure 5. DMC energy against time step for a 64-electron ferromagnetic 2D hexagonal 
Wigner crystal at density parameter Ts — 50 a.u. with a Slater- Jastrow wave function. 
The solid line is a linear fit to the data. 



the probability distributions for the Pulay terms in the forces described in section 11.2 
decay as \F - Fo|-^/^ so that the variance of the force is infinity |119] . Reasonably 
robust estimates of the errors can still be made, although it has to be accepted that 
they are not as well founded as for Gaussian statistics. 
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Figure 6. Variation in the local energy Ei^ of a silane (SiH4) molecule as an electron 
moves through the nodal surface at a; = 0. The local energy diverges as 



The data produced by VMC and DMC calculations are correlated from one step 
to the next. The problem is very important in DMC because short time steps are 
used to reduce the effect of the approximation in the Green's function. The simulation 
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effectively produces only one independent data point per correlation time, so that the 
estimate of the statistical error obtained on the assumption that the data points are 
independent is too small. We use the "blocking method" to obtain an estimate of the 
error. In this approach adjacent data points are averaged to form block averages |120] . 
This procedure is carried out recursively so that after each blocking transformation the 
number of data points is reduced by one half. An example of blocking is shown in 
figure [7} The computed value of the standard error increases with the number of 
blocking transformations k until a limiting value is reached when the block length starts 
to exceed the correlation time. The standard error in the mean is estimated by the 
value of A on the plateau. Because the sizes of the error bars on QMC expectation 
values are themselves approximate estimates, apparent outliers in QMC data can be 
more common than one might expect on the basis of Gaussian statistics. 
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Figure 7. Blocking analysis of data for an (all-electron) lithium atom. The blocking 
analysis indicates that the true standard error in the mean is about A = 2.6 x 10~^ 
a.u., which is reached at about blocking transformation k = 10, while the raw value is 
Ao = 7.0 X 10-s a.u. 

10. Evaluating other expectation values 

As mentioned in section [T| VMC and DMC can be used to calculate expectation values 
of many time-independent operators, not just the Hamiltonian. Typical quantities of 
interest are particle densities, pair correlation functions and one- and two-body density 
matrices, all of which can be evaluated using the casino code. It is not possible to obtain 
unbiased expectation values directly from the DMC distribution, /(R), for operators 
which do not commute with the Hamiltonian (which includes all of the quantities 
mentioned in the previous sentence). Unbiased (within the fixed-node approximation) 
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estimates can be obtained as pure expectation values, 
/0o(R)i0o(R) dn 



{A) 



(3J 



Pure expectation values can be obtained using a variety of methods: the approximate 
(but often very accurate) extrapolation technique [58], the future walking technique 
|12H 1122] , which is formally exact but statistically poorly behaved, and the reptation 
QMC technique of Baroni and Moroni |123j . which is formally exact and well behaved, 
but quite expensive. The extrapolation technique can be used for any operator, but 
the future walking and reptation techniques are limited to spatially local multiplicative 
operators. 

Here we shall illustrate the use of the extrapolation technique [SH] to calculate 
the charge density of a Wigner crystal. The pure estimate of the charge density p is 
approximated as 

Pext - 2pDMC - PVMC- (39) 

The errors in both the VMC and DMC charge densities pvMC and pdmc are linear in 
the error in the trial wave function, but the error in the extrapolated estimate pext is 
quadratic in the error in the wave function. 
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Figure 8. Charge density of a triangular antiferromagnetic Wigner crystal at density 
parameter Ts = 30 a.u., plotted along a line between a pair of nearest-neighbour lattice 
sites. Two different wave functions are used: wave function f was optimised by variance 
minimisation, while wave function 2 was optimised by energy minimisation. The inset 
shows the extrapolation with wave function 1 at the minimum in greater detail. 



At low densities the HEG freezes into a Wigner crystal to minimise the electrostatic 
repulsion between electrons. The charge density of a 2D Wigner crystal 1124] close 
to the crystallisation density is shown in figure [81 VMC, DMC and extrapolated results 
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are shown for two different trial wave functions. It can be seen that the dependence 
of the extrapolated estimate on the trial wave function is much smaller than for the 
raw VMC and DMC estimates, so we may have more confidence in the extrapolated 
estimate of the charge density. 



11. Energy differences and energy derivatives 

In electronic structure theory one is almost always interested in the differences in energy 
between systems. All electronic structure methods for complex systems rely for their 
accuracy on the cancellation of errors in energy differences. In DMC this helps with 
all the sources of error mentioned in section |9] except the statistical errors. Fixed-node 
errors tend to cancel because the DMC energy is an upper bound, but even though 
DMC often retrieves 95% or more of the correlation energy, non-cancellation of nodal 
errors is the most important source of error in DMC results. 



11.1. Energy differences in QMC 

Correlated sampling methods allow the computation of the energy difference between 
two similar systems with a smaller statistical error than those obtained for the individual 
energies [HI]. Correlated sampling is relatively straightforward in VMC, and a version 



of it is described in section 4A^ in the context of optimising wave functions by variance 
minimisation. 



11.2. Energy derivatives (forces) in QMC 

Atomic forces are useful for relaxing the structures of molecules and solids, calculating 
their vibrational properties, and for performing molecular dynamics (MD) simulations. 
It has proved difficult to develop accurate and efficient methods for calculating atomic 
forces within QMC, although considerable progress has been made in recent years. 
Difficulties have arisen in obtaining accurate expressions for DMC forces which can 
readily be evaluated and in the statistical properties of the expressions, which are not 
as advantageous as those for the energy. 

According to the Hellmann-Feynman theorem (HFT), the derivative of the energy 
with respect to a parameter A in the Hamiltonian is 

where the primes denote derivatives with respect to A. This expression is valid when ^ 
is an exact eigenstate of H. 

Unfortunately the HFT is not normally applicable within QMC because the wave 
functions are approximate. Exact expressions for the VMC and DMC forces must 
therefore contain additional Pulay terms which depend on \E''rp. To define the force 
properly it is therefore necessary to define and evaluate \l/'rp. 
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The DMC algorithm solves for the ground state of the fixed-node Hamiltonian 
exactly and therefore the HFT holds. Unfortunately the fixed-node Hamiltonian 
is different from the physical Hamiltonian because it contains an additional infinite 
potential barrier on the nodal surface of which forces the DMC wave function 0o 
to go to zero. As A varies, the nodal surface, and hence the infinite potential barrier, 
moves, giving a contribution to fl' |125[ 11261 1127] which depends on and and is 
classified as a Pulay term. 

The Pulay terms arising from the derivative of the mixed estimate of the energy 
of equation (21) contain 0q, the derivative of the DMC wave function. This quantity 
cannot readily be evaluated, and the approximation 



(41) 



00 

has normally been used [HSl M M [Ml [1321 UMl [123 [IMl [I3S| • However, it leads to 
errors of first order in (\E't — 0o) and {f^'^ — cf)'^)^ therefore its accuracy depends sensitively 
on the quality of \E't and ^^'rp, and in practice this approximation is often inadequate. 
The pure DMC energy, 

/ (i)oH(f)Q dR 



(42) 



/ 0000 dYi 

is equal to the mixed DMC energy. Forces may also be calculated within pure DMC, 
and although this is more expensive it brings significant advantages. The derivative E'jy 
contains the derivative of the DMC wave function, 0q. However, Badinski et al. |127] 
showed that 0o can be eliminated from the pure DMC expression, giving the exact result 

/ 0000 00 ^^Vo dR 1 / 0000 ^i^l Vr^tI^t dS 
J(l)Q(l)odR 2 /0o0orfR 

where dS denotes an element of the nodal surface. Unfortunately it is not 
straightforward to evaluate integrals over the nodal surface. The nodal surface integral 
can be converted into a volume integral in which 0q does not appear using an 



E' 



(43) 



approximation with an error of order (\E't 
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This expression is readily calculable if one generates configurations distributed according 
to the pure (0o) and variational (^E't) distributions. The approximation is in the Pulay 
terms, which are smaller in pure than in mixed DMC and, in addition, the approximation 
in equation (44) is second order compared with the first-order error in equation (41). 
Equation (44) satisfies the zero variance condition; if \E't and \E''rp are exact the variance 
of the force obtained from equation (44) is zero. Equation (44) has been used to obtain 
very accurate forces in small molecules \135\ I119j . The calculation of accurate DMC 
forces is still in its infancy, but it does appear that equation (44) offers a very promising 
way forward. 
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12. Conclusions 

QMC methods provide a framework for computing the properties of correlated quantum 
systems to high accuracy within polynomial time [2], facilitating applications to large 
systems. They can be applied to fermions and bosons with arbitrary inter-particle 
potentials and external fields. These intrinsically parallel methods are ideal for utilising 
current and next-generation massively parallel computers. Their accuracy, generality 
and wide applicability suggest that they will play an important role in improving our 
understanding of the behaviour of large assemblies of quantum particles. 

It is believed |136j that a complete solution to the fermion sign problem may be 
impossible, and any exact fermion method may be exponentially slow on a classical 
computer. Accurate quantum chemistry techniques such as the "gold standard" coupled 
cluster with single and double excitations and perturbative triples [CCSD(T)] have been 
applied with considerable success to correlated electron problems but, although they are 
also polynomial time algorithms, their cost increases much more rapidly with system 
size than for QMC methods. DFT methods have proved extremely useful in describing 
correlated electron systems, but there are many examples where the accuracy of current 
density functionals has proved wanting. It is important to remember that trial wave 
functions for QMC calculations could be improved by developing new wave function 
forms and better optimisation methods, whereas improving approximate DFT methods 
requires the development of better density functionals, which seems likely to be a much 
harder problem. 

These considerations motivate the development of approximate QMC methods such 
as those described in this review. Although the basics of the DMC algorithm used 
by Ceperley and Alder in 1980 |3j have remained unchanged, enormous progress has 
been made in using more complex trial wave functions and in optimising the many 
parameters in them. There is every reason to believe that the current high rate of 
progress will continue for many years to come. Although these QMC methods will 
remain approximate, it is clear that they can deliver highly accurate results provided 
the trial wave functions are accurate enough. Development of sophisticated computer 
packages [32] such as the casino code jlTl I1U7] should help to promote these methods. 
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